DOCUMENT RESUME

ED 395 963                                        TM 025 101

AUTHOR        Holland, Paul W.
TITLE         A Note on the Covariance of the Mantel-Haenszel
              Log-Odds Ratio Estimator and the Sample Marginal
              Rates. Program Statistics Research Technical Report
              No. 89-85.
INSTITUTION   Educational Testing Service, Princeton, N.J.
REPORT NO     ETS-RR-89-19
PUB DATE      Feb 89
NOTE          17p.
PUB TYPE      Reports - Evaluative/Feasibility (142)

EDRS PRICE    MF01/PC01 Plus Postage.
DESCRIPTORS   *Difficulty Level; *Estimation (Mathematics); *Item
              Bias; *Risk; Sampling; *Test Items; Test Use
IDENTIFIERS   Item Bias Detection; *Mantel Haenszel Procedure



ABSTRACT 

A simple technique, developed by A. Phillips (1987),
is used to approximate the covariance between the Mantel-Haenszel
log-odds-ratio estimator for a 2 x 2 x k table and the sample
marginal proportions. These results are then applied to obtain an
approximate variance estimate of an adjusted risk difference based on
the Mantel-Haenszel odds-ratio estimator. The adjusted risk
difference is of potential value in those applications where at least
one of the sample rates is descriptive of a relevant population rate.
The example applies to the use of the Mantel-Haenszel estimator to
study the differential difficulty of test questions across groups of
examinees. (Contains 11 references.) (Author/SLD)






Reproductions supplied by EDRS are the best that can be made 
from the original document. 






RR-89-19 



A Note on the Covariance of the 
Mantel-Haenszel Log-Odds Ratio Estimator 
and the Sample Marginal Rates 



Paul W. Holland 




PROGRAM STATISTICS RESEARCH

TECHNICAL REPORT NO. 89-85

EDUCATIONAL TESTING SERVICE
PRINCETON, NEW JERSEY 08541



A NOTE ON THE COVARIANCE OF THE MANTEL-HAENSZEL LOG-ODDS-RATIO 
ESTIMATOR AND THE SAMPLE MARGINAL RATES 



Paul W. Holland

Program Statistics Research
Technical Report No. 89-85

Research Report No. 89-19

Educational Testing Service
Princeton, New Jersey 08541-0001

February 1989

Copyright © 1989 by Educational Testing Service. All rights reserved.



The Program Statistics Research Technical Report Series is
designed to make the working papers of the Research Statistics Group
at Educational Testing Service generally available. The series consists
of reports by the members of the Research Statistics Group as
well as their external and visiting statistical consultants.
Reproduction of any portion of a Program Statistics Research Technical
Report requires the written consent of the author(s).



ABSTRACT 



A simple technique, developed in Phillips (1987), is used to approximate $\mathrm{Cov}(\hat\theta_{MH}, \hat p_i)$, $i = 1, 2$, where $\hat\theta_{MH}$ is the Mantel-Haenszel log-odds-ratio estimator for a 2x2xK table and the $\hat p_i$ are the sample marginal proportions. These results are then applied to obtain an approximate variance estimate of an adjusted risk difference based on the Mantel-Haenszel odds-ratio estimator.

1. INTRODUCTION

Consider the standard 2x2xK table whose 2x2 layer is specified below. 





              1          0          Total

Group 1       $A_k$      $B_k$      $n_{1k}$
Group 2       $C_k$      $D_k$      $n_{2k}$
Total         $m_{1k}$   $m_{0k}$   $t_k$


Probability models for the 2x2xK table include the two-binomial model (2B), Hauck, Anderson & Leahy (1982), and the non-central (or "extended") hypergeometric model (NCH), Breslow (1981). In both models the different 2x2 layers of the 2x2xK table are statistically independent. In the 2B model, $A_k$ and $C_k$ are independent binomial variates, with $A_k \sim B(p_{1k}, n_{1k})$ and $C_k \sim B(p_{2k}, n_{2k})$. In the NCH model, $A_k$ has the non-central hypergeometric distribution given in (1.1), in which the margins $n_{1k}$, $m_{1k}$ and $t_k$ are regarded as fixed, and $\psi_k$ is the non-centrality parameter.

$$\mathrm{Prob}(A_k = a \mid \psi_k, n_{1k}, m_{1k}, t_k) = \psi_k^{a} \binom{n_{1k}}{a} \binom{t_k - n_{1k}}{m_{1k} - a} \Big/ D \qquad (1.1)$$

where

$$D = \sum_u \psi_k^{u} \binom{n_{1k}}{u} \binom{t_k - n_{1k}}{m_{1k} - u}. \qquad (1.2)$$

In (1.2), the range of summation is given by

$$\max(0, m_{1k} + n_{1k} - t_k) \le u \le \min(m_{1k}, n_{1k}). \qquad (1.3)$$

Note that in (1.1) the integer, a, is also subject to the inequalities in (1.3). When $\psi_k = 1$, (1.1) reduces to the usual hypergeometric distribution. The NCH model may be viewed as the conditional distribution of $A_k$ in the 2B model given the total $A_k + C_k = m_{1k}$.

The common odds-ratio assumption may be expressed in both the 2B and the
NCH models. In the 2B model it is expressed by assuming that the odds-ratio, i.e.,

$$\psi_k = \frac{p_{1k}/(1 - p_{1k})}{p_{2k}/(1 - p_{2k})},$$

does not depend on k, i.e., $\psi_k = \psi$.

The common-odds-ratio assumption may be expressed in the NCH model by the assumption that $\psi_k$ in (1.1) does not depend on k.
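
As a minimal sketch, the NCH probabilities in (1.1)-(1.3) can be tabulated directly for a single layer. The function name `nch_pmf` and the numerical values are illustrative assumptions, not part of the report.

```python
from math import comb

def nch_pmf(psi, n1, m1, t):
    """Return {a: Prob(A = a)} for one layer under the NCH law (1.1)."""
    lo = max(0, m1 + n1 - t)   # lower limit of the range (1.3)
    hi = min(m1, n1)           # upper limit of the range (1.3)
    weights = {a: psi**a * comb(n1, a) * comb(t - n1, m1 - a)
               for a in range(lo, hi + 1)}
    D = sum(weights.values())  # normalizing constant (1.2)
    return {a: w / D for a, w in weights.items()}

# Hypothetical layer: n_1k = 5, m_1k = 4, t_k = 12, odds ratio psi = 2
pmf = nch_pmf(psi=2.0, n1=5, m1=4, t=12)
```

Setting `psi=1.0` in this sketch gives the ordinary hypergeometric probabilities, as noted after (1.3).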

The Mantel-Haenszel (1959) estimator, $\hat\psi_{MH}$, for $\psi$ under the common odds-ratio assumption is defined by

$$\hat\psi_{MH} = \Big(\sum_k A_k D_k / t_k\Big) \Big/ \Big(\sum_k B_k C_k / t_k\Big). \qquad (1.4)$$

It is often convenient to work with the natural log of $\psi$, i.e.,

$$\theta = \ln(\psi),$$

and the corresponding estimator

$$\hat\theta_{MH} = \ln(\hat\psi_{MH}). \qquad (1.5)$$

The two marginal rates (or risks) are the sample proportions:

$$\hat p_1 = \Big(\sum_k A_k\Big) / n_1 \quad \text{and} \quad \hat p_2 = \Big(\sum_k C_k\Big) / n_2, \qquad (1.6)$$

where

$$n_1 = \sum_k n_{1k} \quad \text{and} \quad n_2 = \sum_k n_{2k}.$$
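
The definitions (1.4)-(1.6) translate into a few lines of code. The sketch below assumes each layer is supplied as a tuple of counts (A, B, C, D); the function name `mh_estimates` and the example counts are hypothetical.

```python
from math import log

def mh_estimates(layers):
    """layers: list of (A, B, C, D) counts, one tuple per 2x2 layer."""
    # Numerator and denominator sums of (1.4); t_k = A + B + C + D
    R = sum(a * d / (a + b + c + d) for a, b, c, d in layers)
    S = sum(b * c / (a + b + c + d) for a, b, c, d in layers)
    psi_mh = R / S
    n1 = sum(a + b for a, b, c, d in layers)       # Group 1 total
    n2 = sum(c + d for a, b, c, d in layers)       # Group 2 total
    p1 = sum(a for a, b, c, d in layers) / n1      # marginal rate, (1.6)
    p2 = sum(c for a, b, c, d in layers) / n2
    return psi_mh, log(psi_mh), p1, p2

# Made-up counts for K = 3 layers
layers = [(10, 5, 6, 9), (8, 2, 7, 3), (4, 6, 2, 8)]
psi_mh, theta_mh, p1_hat, p2_hat = mh_estimates(layers)
```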



The main results of this note are simple asymptotic expressions for the covariance between $\hat\theta_{MH}$ and $\hat p_1$ and $\hat p_2$ that are valid for both the 2B and the NCH models. My approach is to exploit two useful formulas ((1.8) and (1.9) below) that were developed in Phillips (1987) and given in Phillips and Holland (1987). These are summarized in the following two lemmas.

Lemma 1: Let $R = \sum_k A_k D_k / t_k$ and $S = \sum_k B_k C_k / t_k$, and let $r = E(R)$ and $s = E(S)$, where these expectations are taken with respect to either the 2B or the NCH models. Define $\tilde\theta_{MH}$ by

$$\tilde\theta_{MH} = \theta + \frac{R - r}{r} - \frac{S - s}{s}. \qquad (1.7)$$

Then

$$\hat\theta_{MH} = \tilde\theta_{MH} + O_p\Big(\frac{\mathrm{Var}(R)}{r^2} + \frac{\mathrm{Var}(S)}{s^2}\Big). \qquad (1.8)$$

The virtue of (1.8) is that it allows the non-linear $\hat\theta_{MH}$ to be approximated by $\tilde\theta_{MH}$, which is linear in R and S. In addition, (1.8) shows the way in which the distribution of $\hat\theta_{MH}$ approximates that of $\tilde\theta_{MH}$, namely, that the quantity $\mathrm{Var}(R)/r^2 + \mathrm{Var}(S)/s^2$ must be small. Phillips and Holland (1987) show how this condition is satisfied in both the "large stratum" and "sparse data" situations.

Lemma 2: Let $x^{(k)} = x(x-1)\cdots(x-k+1)$ denote the descending factorial. If $\alpha$, $\beta$, $\gamma$, $\delta$ are non-negative integers and $\epsilon$ is any non-negative integer not exceeding $\min(\alpha, \delta)$, then under either the 2B or the NCH model we have

$$E\big(A_k^{(\alpha)} B_k^{(\beta)} C_k^{(\gamma)} D_k^{(\delta)}\big) = \psi_k^{\epsilon}\, E\big(A_k^{(\alpha-\epsilon)} B_k^{(\beta+\epsilon)} C_k^{(\gamma+\epsilon)} D_k^{(\delta-\epsilon)}\big). \qquad (1.9)$$

Equation (1.9) may be used to establish useful relationships between various covariances that involve $A_k$, $B_k$, $C_k$ and $D_k$. This is illustrated in the proof of Theorem 1.
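
Identity (1.9) can be checked numerically for one layer of the NCH model by enumerating the distribution (1.1). The sketch below does this for the case $\alpha = 2$, $\beta = \gamma = 0$, $\delta = 1$, $\epsilon = 1$; the helper names and the margins chosen are assumptions made only for illustration.

```python
from math import comb

def falling(x, k):
    """Descending factorial x^(k) = x (x-1) ... (x-k+1); equals 1 when k = 0."""
    out = 1
    for i in range(k):
        out *= x - i
    return out

def nch_moment(psi, n1, m1, t, alpha, beta, gamma, delta):
    """E(A^(alpha) B^(beta) C^(gamma) D^(delta)) under (1.1), by enumeration."""
    lo, hi = max(0, m1 + n1 - t), min(m1, n1)
    num = den = 0.0
    for a in range(lo, hi + 1):
        b, c = n1 - a, m1 - a          # remaining cells follow from the margins
        d = (t - n1) - c
        w = psi**a * comb(n1, a) * comb(t - n1, m1 - a)
        num += w * falling(a, alpha) * falling(b, beta) * falling(c, gamma) * falling(d, delta)
        den += w
    return num / den

psi, n1, m1, t = 1.7, 6, 5, 14         # hypothetical layer
lhs = nch_moment(psi, n1, m1, t, 2, 0, 0, 1)        # E(A^(2) D)
rhs = psi * nch_moment(psi, n1, m1, t, 1, 1, 1, 0)  # psi * E(A B C)
```

The two sides agree up to floating-point rounding, which is the instance of (1.9) used in the proof of Theorem 1.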

Theorem 1: Under the common odds-ratio assumption and either the 2B or the NCH model

(a) $\mathrm{Cov}(\tilde\theta_{MH}, \hat p_1) = 1/n_1$,

(b) $\mathrm{Cov}(\tilde\theta_{MH}, \hat p_2) = -1/n_2$.

From Lemma 1 we may use $\tilde\theta_{MH}$ as an approximation to $\hat\theta_{MH}$ and thereby use Theorem 1 to obtain asymptotic approximations to $\mathrm{Cov}(\hat\theta_{MH}, \hat p_1)$ and $\mathrm{Cov}(\hat\theta_{MH}, \hat p_2)$ under the 2B and NCH models. In section 2, I discuss an application of these covariance calculations.
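
Because the layers are independent, the covariances in Theorem 1 can be computed exactly for a small 2B example by enumerating each layer. The following sketch does so under stated assumptions: the layer sizes, the Group 2 probabilities and the common odds ratio are made up, and the function names are hypothetical.

```python
from math import comb

def binom_pmf(n, p, x):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def expect(g, n1k, n2k, p1k, p2k):
    """Exact E[g(A, B, C, D)] for one layer of the 2B model."""
    return sum(
        binom_pmf(n1k, p1k, a) * binom_pmf(n2k, p2k, c) * g(a, n1k - a, c, n2k - c)
        for a in range(n1k + 1)
        for c in range(n2k + 1)
    )

def cov_theta_tilde(layers, psi):
    """layers: (n1k, n2k, p2k) per layer; p1k is set so every layer has odds ratio psi."""
    n1 = sum(l[0] for l in layers)
    n2 = sum(l[1] for l in layers)
    r = s = cr1 = cs1 = cr2 = cs2 = 0.0
    for n1k, n2k, p2k in layers:
        odds1 = psi * p2k / (1 - p2k)
        p1k = odds1 / (1 + odds1)
        t = n1k + n2k

        def E(g):
            return expect(g, n1k, n2k, p1k, p2k)

        e_ad = E(lambda a, b, c, d: a * d)
        e_bc = E(lambda a, b, c, d: b * c)
        e_a = E(lambda a, b, c, d: a)
        e_c = E(lambda a, b, c, d: c)
        r += e_ad / t                       # r = E(R)
        s += e_bc / t                       # s = E(S)
        cr1 += (E(lambda a, b, c, d: a * d * a) - e_ad * e_a) / t   # Cov(A D, A) / t
        cs1 += (E(lambda a, b, c, d: b * c * a) - e_bc * e_a) / t   # Cov(B C, A) / t
        cr2 += (E(lambda a, b, c, d: a * d * c) - e_ad * e_c) / t   # Cov(A D, C) / t
        cs2 += (E(lambda a, b, c, d: b * c * c) - e_bc * e_c) / t   # Cov(B C, C) / t
    # Covariance of the linearized estimator (1.7) with each marginal rate
    cov1 = (cr1 / r - cs1 / s) / n1
    cov2 = (cr2 / r - cs2 / s) / n2
    return cov1, cov2, n1, n2

cov1, cov2, n1, n2 = cov_theta_tilde([(4, 5, 0.3), (6, 3, 0.5), (3, 7, 0.2)], psi=2.5)
```

In this sketch `cov1` and `cov2` can be compared with `1 / n1` and `-1 / n2`, the values stated in parts (a) and (b).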

Proof of Theorem 1. From the definitions made in Lemma 1,

$$\mathrm{Cov}(\tilde\theta_{MH}, \hat p_1) = \mathrm{Cov}\Big(\theta + \frac{R - r}{r} - \frac{S - s}{s},\ \hat p_1\Big) = \mathrm{Cov}\Big(\frac{R}{r} - \frac{S}{s},\ \hat p_1\Big) = \frac{1}{r}\Big\{\mathrm{Cov}(R, \hat p_1) - \frac{r}{s}\,\mathrm{Cov}(S, \hat p_1)\Big\}. \qquad (1.10)$$

Hence we need expressions for $\mathrm{Cov}(R, \hat p_1)$ and $\mathrm{Cov}(S, \hat p_1)$.

It is well-known (and an easy application of Lemma 2 with $\alpha = \delta = 1$, $\beta = \gamma = 0$, and $\epsilon = 1$) that under either the 2B or the NCH models we have

$$E(A_k D_k) = \psi_k\, E(B_k C_k). \qquad (1.11)$$

Then (1.11) may be used to show that under the common odds-ratio assumption

$$r = E(R) = \psi\, E(S) = \psi s, \quad \text{or} \quad \frac{r}{s} = \psi. \qquad (1.12)$$

Parts (a) and (b) of Theorem 1 are proved in a similar manner so I will consider only (a).

We need expressions for $\mathrm{Cov}(R, \hat p_1)$ and $\psi\,\mathrm{Cov}(S, \hat p_1)$. But

$$\mathrm{Cov}(R, \hat p_1) = \sum_k \frac{1}{t_k}\,\mathrm{Cov}(A_k D_k, \hat p_1) = \frac{1}{n_1} \sum_{k,j} \frac{1}{t_k}\,\mathrm{Cov}(A_k D_k, A_j). \qquad (1.13)$$

But, for $k \ne j$, the variates are independent so that (1.13) reduces to

$$\mathrm{Cov}(R, \hat p_1) = \frac{1}{n_1} \sum_k \frac{1}{t_k} \big\{E(A_k^2 D_k) - E(A_k D_k)\, E(A_k)\big\}. \qquad (1.14)$$

Similarly we have

$$\psi\,\mathrm{Cov}(S, \hat p_1) = \frac{1}{n_1} \sum_k \frac{1}{t_k} \big\{\psi E(A_k B_k C_k) - \psi E(B_k C_k)\, E(A_k)\big\}. \qquad (1.15)$$

But

$$E(A_k^2 D_k) = E(A_k (A_k - 1) D_k) + E(A_k D_k) = E(A_k^{(2)} D_k) + E(A_k D_k). \qquad (1.16)$$

Now apply Lemma 2 to (1.16) with $\alpha = 2$, $\beta = \gamma = 0$, $\delta = 1$ and $\epsilon = 1$ and obtain

$$E(A_k^2 D_k) = \psi E(A_k B_k C_k) + E(A_k D_k). \qquad (1.17)$$

Hence, from (1.17) and (1.11), (1.14) becomes

$$\mathrm{Cov}(R, \hat p_1) = \frac{1}{n_1} \sum_k \frac{1}{t_k} \big\{\psi E(A_k B_k C_k) - \psi E(B_k C_k)\, E(A_k) + E(A_k D_k)\big\} \qquad (1.18)$$

$$= \psi\,\mathrm{Cov}(S, \hat p_1) + \frac{1}{n_1} \sum_k \frac{1}{t_k}\, E(A_k D_k). \qquad (1.19)$$

Thus, combining (1.19) with (1.10) and (1.12) we have

$$\mathrm{Cov}(\tilde\theta_{MH}, \hat p_1) = \frac{1}{r} \Big[\frac{1}{n_1} \sum_k \frac{1}{t_k}\, E(A_k D_k)\Big] = \frac{1}{n_1}\,\frac{r}{r} = \frac{1}{n_1}.$$

QED

2. A THEORETICAL APPLICATION AND EXAMPLE

In this section we develop the necessary formulas for applying the Taylor-series or $\delta$-method to obtain standard errors for functions of $\hat\theta_{MH}$, $\hat p_1$ and $\hat p_2$, and then apply these results and those of section 1 to obtain an approximate standard error for a particular function of $\hat\theta_{MH}$ and $\hat p_2$ that arises in the use of the Mantel-Haenszel estimator to study differential difficulty of test questions across groups of examinees (Holland, 1985; Holland and Thayer, 1988).

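2.1 A general formula

Let $\hat f = f(\hat\theta_{MH}, \hat p_1, \hat p_2)$ be a differentiable function of $\hat\theta_{MH}$, $\hat p_1$ and $\hat p_2$. The $\delta$-method (Bishop, Fienberg and Holland, 1975) may be used to derive the large-sample variance of $\hat f$. It is summarized below.

Theorem 2: As the variances of $\hat\theta_{MH}$, $\hat p_1$ and $\hat p_2$ go to zero, the variance of $\hat f$ is approximated by:

$$\mathrm{Var}(\hat f) \approx \Big(\frac{\partial f}{\partial \theta}\Big)^2 \mathrm{Var}(\hat\theta_{MH}) + \Big(\frac{\partial f}{\partial p_1}\Big)^2 \mathrm{Var}(\hat p_1) + \Big(\frac{\partial f}{\partial p_2}\Big)^2 \mathrm{Var}(\hat p_2) + 2\,\frac{\partial f}{\partial \theta}\frac{\partial f}{\partial p_1}\,\mathrm{Cov}(\hat\theta_{MH}, \hat p_1) + 2\,\frac{\partial f}{\partial \theta}\frac{\partial f}{\partial p_2}\,\mathrm{Cov}(\hat\theta_{MH}, \hat p_2) + 2\,\frac{\partial f}{\partial p_1}\frac{\partial f}{\partial p_2}\,\mathrm{Cov}(\hat p_1, \hat p_2). \qquad (2.1)$$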
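
The right-hand side of (2.1) is the quadratic form of the gradient of f with the covariance matrix of $(\hat\theta_{MH}, \hat p_1, \hat p_2)$. A minimal sketch follows; the function name `delta_method_variance` and all numbers are invented for illustration only.

```python
def delta_method_variance(grad, cov):
    """Quadratic form grad' * cov * grad, i.e. the right-hand side of (2.1)."""
    k = len(grad)
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(k) for j in range(k))

# Order of components: theta_MH, p1, p2 (all values are made up)
grad = [0.5, -0.3, 0.8]
cov = [
    [0.040, 0.005, -0.005],
    [0.005, 0.003, 0.000],
    [-0.005, 0.000, 0.005],
]
var_f = delta_method_variance(grad, cov)
```

Writing the approximation as a quadratic form keeps the three variance terms and the three doubled covariance terms of (2.1) in one expression, provided the matrix `cov` is symmetric.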

From Theorem 2 the covariances derived in section 1 may be combined with variance estimates of $\hat p_1$ and $\hat p_2$ and the covariance of $\hat p_1$ and $\hat p_2$ to yield an approximate standard error for $\hat f$. Robins, Breslow, and Greenland (1986) give an estimator, $\hat\sigma^2(\hat\theta_{MH})$, of the variance of $\hat\theta_{MH}$ that is valid in a variety of asymptotic situations. This variance estimator may be expressed in the notation of Lemma 1 as

$$\hat\sigma^2(\hat\theta_{MH}) = \frac{1}{2R^2} \sum_k t_k^{-2} \big[A_k D_k + \hat\psi_{MH} B_k C_k\big] \big[A_k + D_k + \hat\psi_{MH}(B_k + C_k)\big]. \qquad (2.2)$$

This variance estimator is also discussed in Phillips and Holland (1987).

The following lemma summarizes the variances and covariance of $\hat p_1$ and $\hat p_2$ in the 2B and NCH cases.

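Lemma 3: (a) In the 2B case:

$$\mathrm{Var}(\hat p_i) = \frac{1}{n_i} \sum_k w_{ik}\, p_{ik}(1 - p_{ik}) \qquad (2.3)$$

where $w_{ik} = n_{ik}/n_i$, and

$$\mathrm{Cov}(\hat p_1, \hat p_2) = 0. \qquad (2.4)$$

(b) In the NCH case:

$$\mathrm{Var}(\hat p_i) = \frac{1}{n_i^2} \sum_k \mathrm{Var}(A_k) \qquad (2.5)$$

and

$$\mathrm{Cov}(\hat p_1, \hat p_2) = -\frac{1}{n_1 n_2} \sum_k \mathrm{Var}(A_k). \qquad (2.6)$$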

In Lemma 3, part (a), it is clear that estimates of the variances and covariances of $\hat p_1$ and $\hat p_2$ under the 2B model are straightforward. For example, $A_k/n_{1k}$ and $C_k/n_{2k}$ can be used as estimates of $p_{1k}$ and $p_{2k}$, respectively, in (2.3). On the other hand, by Jensen's inequality we have

$$\sum_k w_{ik}\, p_{ik}(1 - p_{ik}) \le p_i(1 - p_i), \qquad (2.7)$$

where $p_i = \sum_k w_{ik}\, p_{ik}$. Hence from (2.3) and (2.7) we see that the simple "binomial variance" estimate,

$$\frac{1}{n_i}\, \hat p_i(1 - \hat p_i), \qquad (2.8)$$

provides an estimate of $\mathrm{Var}(\hat p_i)$ that is, at worst, an over-estimate. When K is large and some of the $t_k$ are small, (2.8) is often a better estimate of $\mathrm{Var}(\hat p_i)$ than the one obtained by substituting the sample proportions, $\hat p_{ik}$, for $p_{ik}$ in (2.3).
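
The two variance estimates discussed above can be compared on a small example. This is only a sketch: the function names and the counts are hypothetical, and the first function is the plug-in version of (2.3) with sample proportions in place of the $p_{ik}$.

```python
def stratified_variance(counts, sizes):
    """Plug-in version of (2.3): counts[k] successes out of sizes[k] in layer k."""
    n = sum(sizes)
    return sum((nk / n) * (xk / nk) * (1 - xk / nk)
               for xk, nk in zip(counts, sizes)) / n

def binomial_variance(counts, sizes):
    """The simple 'binomial variance' estimate (2.8) from the pooled proportion."""
    n = sum(sizes)
    p = sum(counts) / n
    return p * (1 - p) / n

# One group, K = 3 layers of 10 observations each (made-up counts)
counts, sizes = [2, 5, 9], [10, 10, 10]
v_strat = stratified_variance(counts, sizes)
v_binom = binomial_variance(counts, sizes)
```

By (2.7), `v_strat` cannot exceed `v_binom`; the gap grows as the layer proportions spread out.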




From part (b) of Lemma 3 it is evident that estimates of the variances and covariances of $\hat p_1$ and $\hat p_2$ under the NCH model all require estimates of $\sum_k \mathrm{Var}(A_k)$, which, in turn, involves the estimates of the variance of the NCH variate, $A_k$. Harkness (1965) discusses the moments of $A_k$ in the NCH case. I do not know whether or not those results can be used to give valid estimates of $\sum_k \mathrm{Var}(A_k)$ that are needed to use the NCH part of Lemma 3.

2.2 An application to a "Mantel-Haenszel adjusted risk difference".

In biomedical applications, $\hat\psi_{MH}$ provides an adjusted estimate of the relative odds of getting a disease in an exposed group of individuals compared to an unexposed group. The adjustment is for differences in the distribution of potential confounding variables that may exist between the two groups. Holland (1985) discussed the use of $\hat\psi_{MH}$ as an adjusted measure of "bias" in test questions. In this use of $\hat\psi_{MH}$, "getting the disease" is replaced by "getting the test item right" and the "exposed" and "unexposed" groups are replaced by a reference and a focal group of examinees (e.g., Whites and Blacks or Males and Females). The adjustment is for overall test performance. Since that suggestion, the use of the Mantel-Haenszel procedure to measure "item bias" has become widespread at testing organizations such as Educational Testing Service. In these testing applications, there is an interest in expressing the estimated logit differences, $\hat\theta_{MH}$, in terms of the probability scale as an adjusted difference in proportions (e.g., Dorans and Kulick, 1986). One way of expressing $\hat\theta_{MH}$ in the "p-scale" is the following statistic that in biometric terms might be called the "Mantel-Haenszel adjusted risk difference",

$$RD_{MH} = \hat p_2 - \frac{\hat p_2 \exp\{\hat\theta_{MH}\}}{(1 - \hat p_2) + \hat p_2 \exp\{\hat\theta_{MH}\}}. \qquad (2.9)$$




The second term in the right-hand side of (2.9) is the value of $\hat p_1$ that would be obtained if $\hat p_2$ were "adjusted" by $\hat\theta_{MH}$, i.e., if $p_1$ were chosen to solve for $p_1$ in the equation,

$$\hat\theta_{MH} = \log\Big(\frac{p_1}{1 - p_1} \Big/ \frac{\hat p_2}{1 - \hat p_2}\Big). \qquad (2.10)$$

The raw risk difference, $\hat p_2 - \hat p_1$, makes no use of the matching or stratification that is available and, for this reason, is not of much practical value except in special circumstances. The Mantel-Haenszel adjusted risk difference is based on the stratification and is therefore a type of "standardized" risk difference.

If $RD_{MH}$ from (2.9) is used, the need for its standard error arises and the results of sections 1 and 2.1 may be used to obtain an estimate of the variance of $RD_{MH}$ under the 2B model. This is given in Theorem 3, below.

Theorem 3: Under the 2B model, the common odds-ratio assumption and any conditions that insure the approximation of $\hat\theta_{MH}$ by $\tilde\theta_{MH}$, the variance of $RD_{MH}$ in (2.9) is estimated by

$$\frac{1}{n_2}(1 - G)^2\, \hat p_2(1 - \hat p_2) + G^2 \big[\hat p_2(1 - \hat p_2)\big]^2\, \hat\sigma^2(\hat\theta_{MH}) + \frac{2}{n_2}\, G(1 - G)\, \hat p_2(1 - \hat p_2) \qquad (2.13)$$

where

$$G = \hat\psi_{MH} \big/ (1 - \hat p_2 + \hat p_2 \hat\psi_{MH})^2$$

and $\hat\sigma^2(\hat\theta_{MH})$ is the Robins-Breslow-Greenland variance estimate of $\hat\theta_{MH}$ given in (2.2).
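
The estimate in Theorem 3 can be assembled from the layer counts in a few steps. The sketch below is one possible arrangement, not the author's program: the function name `rd_mh_with_variance` and the example counts are assumptions, and the variance of $\hat\theta_{MH}$ is computed in the usual Robins-Breslow-Greenland form written with the sum R of Lemma 1.

```python
from math import sqrt

def rd_mh_with_variance(layers):
    """layers: list of (A, B, C, D) counts per 2x2 layer.

    Returns (RD_MH, estimated variance of RD_MH) following (2.9) and (2.13).
    """
    R = sum(a * d / (a + b + c + d) for a, b, c, d in layers)
    S = sum(b * c / (a + b + c + d) for a, b, c, d in layers)
    psi = R / S                                    # Mantel-Haenszel odds ratio (1.4)
    # Robins-Breslow-Greenland variance estimate of log(psi), as in (2.2)
    sigma2 = sum((a * d + psi * b * c) * (a + d + psi * (b + c)) / (a + b + c + d) ** 2
                 for a, b, c, d in layers) / (2 * R ** 2)
    n2 = sum(c + d for a, b, c, d in layers)
    p2 = sum(c for a, b, c, d in layers) / n2      # Group 2 marginal rate (1.6)
    denom = 1 - p2 + p2 * psi
    rd = p2 - p2 * psi / denom                     # (2.9), using exp(theta) = psi
    G = psi / denom ** 2
    v = p2 * (1 - p2)
    var = ((1 - G) ** 2 * v / n2
           + G ** 2 * v ** 2 * sigma2
           + 2 * G * (1 - G) * v / n2)             # (2.13)
    return rd, var

layers = [(10, 5, 6, 9), (8, 2, 7, 3), (4, 6, 2, 8)]   # made-up counts
rd, var = rd_mh_with_variance(layers)
se = sqrt(var)                                          # approximate standard error
```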

Proof: Let $f(\hat\theta_{MH}, \hat p_1, \hat p_2)$ be defined by $RD_{MH}$ in (2.9). Then the relevant derivatives from Theorem 2 are easily shown to be

$$\frac{\partial f}{\partial \theta} = -G\, \hat p_2(1 - \hat p_2), \quad \frac{\partial f}{\partial p_1} = 0, \quad \text{and} \quad \frac{\partial f}{\partial p_2} = 1 - G. \qquad (2.14)$$

These derivatives, when combined via Theorem 2 with the covariance in Theorem 1(b), and using (2.8) instead of (2.3), yield the result. QED.

3. DISCUSSION

The technique used to prove Theorem 1 (i.e., Lemmas 1 and 2) is useful in its own right since it is simple and yet widely applicable to computations involving the Mantel-Haenszel estimator. In addition, I think it is rather remarkable that the asymptotic covariances of $\log(\hat\psi_{MH})$ and the $\hat p_i$ are as simple as they appear in Theorem 1.

The adjusted risk difference, $RD_{MH}$, is of potential value in those applications where at least one of the sample rates, say $\hat p_2$, is descriptive of a relevant population rate. This occurs in the testing applications referred to earlier but may also arise in prospective epidemiological studies. The variance estimate in Theorem 3 is asymptotically valid whenever the Robins-Breslow-Greenland estimate of the variance of $\hat\theta_{MH}$ is valid, with the added proviso that the "binomial variance estimate", (2.8), be an appropriate estimate of the variance of $\hat p_2$. Thus the variance estimate in Theorem 3 will be most useful in the so-called "sparse-data" case where K is large and the $t_k$ are not. In the large stratum case, i.e., when K is small and the $t_k$ are large, it may be better to use formula (2.3) to estimate the variance of $\hat p_2$. This substitution would only change the first term of formula (2.13).

I have not performed a small sample study of the behavior of the variance estimate in Theorem 3, but because of its close connection to the Robins-Breslow-Greenland variance estimate for $\hat\theta_{MH}$ I would expect it to perform quite well in both the sparse data and the large stratum cases.

To extend Theorem 3 to the NCH model it would be necessary to have a useful estimate of $\sum_k \mathrm{Var}(A_k)$ for the NCH case in order to apply Lemma 3(b). I do not know of any results in this area. However, in the NCH model, while the sample marginal rates $\hat p_1$ and $\hat p_2$ are still defined by (1.6), it is not clear what meaning to attach to them as estimates of population rates. Thus, $RD_{MH}$ may not be a useful parameter in the NCH case.